# End-to-End Segmentation

Coco Panoptic Eomt Giant 640

This model demonstrates the potential of the Vision Transformer (ViT) for image segmentation tasks.

Image Segmentation

An end-to-end speaker segmentation model for voice activity detection, overlapped speech detection, and resegmentation.

Audio Processing

A voice activity detection model based on pyannote.audio that identifies active speech segments in audio.

Speech Recognition

© 2025 AIbase